Manuzio: An Object Language for Annotated Text Collections

Authors

  • Marek Maurizio
  • Renzo Orsini
Abstract

Traditionally, text collections are represented as text files with some kind of markup to define extra-textual information such as metadata and annotations. We propose an approach that uses the natural structure of a literary text to build specialized object abstractions over text collections. These objects can be used to make non-hierarchically nested, multi-level annotations, to create complex metadata, and to perform complex queries and analyses on the collection. The Manuzio language is the result of this approach; in this paper we introduce its main features and sketch a system, based on the language, for managing persistent text collections and writing complex applications over them.
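To make the idea of non-hierarchically nested annotations concrete, the following is a minimal Python sketch, not Manuzio syntax. It assumes a textual object can be modelled as a typed character span; the names `TextualObject`, `Collection`, `annotate`, and `overlapping` are hypothetical and do not come from the paper. The sample text and annotation values are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TextualObject:
    """A typed span over the text (hypothetical model, not Manuzio's API)."""
    kind: str   # e.g. "line" or "sentence"
    start: int  # inclusive character offset
    end: int    # exclusive character offset


@dataclass
class Collection:
    text: str
    objects: list = field(default_factory=list)
    annotations: dict = field(default_factory=dict)  # object -> {key: value}

    def add(self, kind, start, end):
        obj = TextualObject(kind, start, end)
        self.objects.append(obj)
        return obj

    def annotate(self, obj, key, value):
        # Annotations attach to objects, not to positions in a markup tree.
        self.annotations.setdefault(obj, {})[key] = value

    def content(self, obj):
        return self.text[obj.start:obj.end]

    def overlapping(self, obj, kind):
        """Objects of the given kind sharing at least one character with obj."""
        return [o for o in self.objects
                if o.kind == kind and o.start < obj.end and obj.start < o.end]


line1 = "Sing, goddess. The anger"
line2 = "of Achilles burns."
c = Collection(line1 + "\n" + line2)

# Two independent structures over the same characters: lines and sentences.
l1 = c.add("line", 0, len(line1))
l2 = c.add("line", len(line1) + 1, len(c.text))
s1 = c.add("sentence", 0, c.text.index(".") + 1)
s2 = c.add("sentence", s1.end + 1, len(c.text))

c.annotate(s2, "note", "illustrative annotation on a sentence")
c.annotate(l1, "note", "illustrative annotation on a line")

# The second sentence crosses the line break, so it nests in neither line.
lines_of_s2 = c.overlapping(s2, "line")
```

Here the second sentence overlaps both lines without being contained in either, which a single markup hierarchy cannot express directly. Storing each structure as its own set of spans lets both levels be annotated and queried independently.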


Similar resources

A Model and a Language for Representing and Manipulating Annotated Text Collections

Traditionally, collections of texts are digitally represented as a set of documents containing the text along with some kind of markup to define extra information, like metadata, annotations, etc. We propose a different approach that models the textual information in a dual way: as a formatted sequence of characters, as well as a composition of a particular kind of objects, called textual objec...


Deriving a Priori Co-occurrence Probability Estimates for Object Recognition from Social Networks and Text Processing

Certain components in images can be recognized with high accuracy, for example, backgrounds such as leaves, grass, snow, sky, water. These components provide the human eye with context for identifying items in the foreground. Likewise for the machine, the identification of background should help in the recognition of foreground objects. But, in this case, the computer needs explicit lists of ob...


Translating Images to Words for Recognizing Objects in Large Image and Video Collections

We present a new approach to the object recognition problem, motivated by the recent availability of large annotated image and video collections. This approach considers object recognition as the translation of visual elements to words, similar to the translation of text from one language to another. The visual elements represented in feature space are categorized into a finite set of blobs. Th...


Compilation of a Mexican Spanish text corpora

Collections of texts with syntactic annotation are nowadays useful resources. They are employed for diverse tasks in theoretical research and natural language applications. The most important collections are dedicated to English, but huge efforts have been made to develop corresponding collections for other languages. In this work we present the initial steps for the compilation of a Mexican Spani...


Object-based Annotations for Discovery and Collaboration

This paper discusses a design for object-based interaction and manipulation for annotating a text discovery application. Rather than attaching annotations to the interface or directly annotating the interface, objects from the interface can be directly annotated and copied into a collection to be viewed outside the context of the main interface. Objects are smaller chunks of the interface whic...




Publication date: 2010